ABSTRACT

A complex of the three (αεθ) core subunits and the β2
sliding clamp is responsible for DNA synthesis by
Pol III, the Escherichia coli chromosomal DNA replicase.
The 1.7 Å crystal structure of a complex
between the PHP domain of α (polymerase) and
the C-terminal segment of ε (proofreading exonuclease)
subunits shows that ε is attached to α at a site
far from the polymerase active site. Both α and ε
contain clamp-binding motifs (CBMs) that interact
simultaneously with β2 in the polymerization mode
of DNA replication by Pol III. Strengthening of both
CBMs enables isolation of stable αεθ:β2 complexes.
Nuclear magnetic resonance experiments with
reconstituted αεθ:β2 demonstrate retention of high
mobility of a segment of 22 residues in the linker
that connects the exonuclease domain of ε with its
α-binding segment. In spite of this, small-angle
X-ray scattering data show that the isolated
complex with strengthened CBMs has a compact,
but still flexible, structure. Photo-crosslinking with
p-benzoyl-L-phenylalanine incorporated at different
sites in the α-PHP domain confirms the conformational
variability of the tether. Structural models
of the αεθ:β2 replicase complex with primer-template
DNA combine all available structural data.

INTRODUCTION 

The replicative DNA polymerases that synthesize the bulk of
chromosomal DNA invariably contain two active sites.
Primer DNA is extended processively by incorporation of
nucleotides at the polymerase site, while mismatched
nucleotides that are incorporated infrequently are removed at the
3′-5′ (proofreading) exonuclease site. In all proofreading
polymerases, the two sites are spatially separated, so a
mechanism is required to transfer the primer-template DNA from
one site to the other when the polymerase needs to transit
between the polymerization and proofreading modes (1).

The 17-subunit DNA polymerase III holoenzyme (Pol
III HE) is the chromosomal replicase in Escherichia coli,
and is composed of 10 different proteins (1). The
three-subunit catalytic core contains one each of the α (1160
residues; 130 kDa), ε (243 residues; 27 kDa) and θ
(8.8 kDa) subunits encoded by the dnaE, dnaQ and holE
genes, respectively. The α subunit contains the polymerase
active site (2,3), the ε subunit is responsible for the 3′-5′
proofreading exonuclease activity (4) and the θ subunit
has no identified enzymatic activity (5). The αεθ core
complex is active alone as a proofreading DNA polymerase,
and co-purification of these three subunits demonstrates
their tight physical association (6,7). Direct
interactions between ε and α (8) and ε and θ (5) have
been demonstrated, but no interaction has been detected
between α and θ.

The αεθ core complex of E. coli DNA Pol III has proven
unsuitable for X-ray crystallography, probably because
crystallization is impeded by the highly flexible polypeptide
linker connecting the globular N-terminal domain of ε
with its C-terminal peptide that binds to α (9). The
complex is also not amenable to detailed NMR studies
because of its high molecular mass. However, 3D structures
of subdomains of the complex have been determined
by X-ray crystallography and in solution by NMR
spectroscopy.

Two crystal structures of α have been reported: of a
C-terminally truncated version (α917) of E. coli α (10)
and of full-length Thermus aquaticus (Taq) α (11). Crystal
structures are also available of the N-terminal globular
domain of E. coli ε (ε186; Figure 1A), both alone (12)
and in complex with HOT, the phage P1 homolog of θ
(13,14). In addition, the structure of the ε186:θ complex
was determined by NMR spectroscopy (15,16).
Full-length ε includes an additional C-terminal segment of 57
residues, in the following referred to as εCTS (Figure 1A).
Residues near the C terminus of the εCTS are known to be
responsible for binding of α (9,17,18). The binding site of ε
has been shown to be within the first 320 residues of α
(19), which includes the N-terminal 270-residue PHP
domain (20) (Figure 1C and D). Residues 190-212 of
the εCTS had earlier been predicted to comprise an
interdomain 'Q-linker' sequence (21). Solution NMR
showed that residues in the εCTS are flexible and that
those around the Q-linker (residues between Thr183 and
Thr201, at least) remain so even in the context of the
165 kDa αεθ core complex. Moreover, the ε186 domain
does not interact, even weakly, with α (9).

The homodimeric β2 sliding clamp is responsible for
processivity of DNA synthesis by the bacterial replisome.
The β2 dimer forms a donut-shaped structure (22) that
encircles (23) and slides on double-stranded (ds) DNA.
Each protomer of β2 has a binding site for penta- or
hexa-peptide clamp-binding motifs (CBMs) that are
found in many proteins, including the δ subunit of the
seven-subunit Pol III clamp loader and all five E. coli
DNA polymerases (I-V) (24), among others (25). The
interaction of CBMs with the β2 sliding clamp provides
a specific way of recruiting requisite enzymes to 3′ ends of
primer-template DNAs, and the Pol III α subunit has two
CBMs: one is at the C-terminus and may be involved in
polymerase recycling during lagging-strand replication;
the other is an internal site that ensures processivity of
the replicase (25,26).

In the present work, we identified the exact site and
mode of binding of ε on α by determining the crystal
structures of constructs where residues 209-243 and
200-243 of εCTS were fused to the N-terminus of the
PHP domain of α (residues 1-270, referred to as α270)
via a nine-residue linker that had been shown in another
context to be flexible (27,28). A fortuitous PCR-generated
mutation in α270, Leu21Pro, enabled crystallization. As
the binding site of ε on α turned out to be far from the
active site of the polymerase, we further investigated the
tether between the N-terminal proofreading domain of ε
and the C-terminal α-binding peptide. Building a model of
the αεθ:β2 complex with primer-template DNA using this
new information and published crystal and NMR structures
of the various protein components indicates that the
tether is sufficiently long to bring the exonuclease domain
of ε closer to the active site of the α polymerase subunit
when proofreading is required. The model positions two
CBMs on the separate subunits of the β2 clamp, one being
the internal CBM of α and the other the weakly binding
CBM just beyond the C-terminus of the exonuclease
domain of ε (25). Mutations of both CBMs for tighter
binding to β2 produced an αεθ:β2 complex that was stable
enough to be isolated chromatographically and used to
collect small-angle X-ray scattering (SAXS) data that are
consistent with the model. NMR measurements showed
that the tether in the εCTS is nevertheless still flexible in
a similar complex. In agreement with the model,
p-benzoyl-L-phenylalanine (Bpa) residues site-specifically
incorporated in α270 were found to afford
photo-crosslinking to the εCTS, in particular at sites located
closer to the active site of the polymerase. The remote
attachment site of ε on α via a long flexible tether
suggests that the mechanism for transition between
polymerization and proofreading modes in Pol III is
fundamentally different from those in other polymerases whose
structures in both modes are known or can be reliably
modeled (29,30).

MATERIALS AND METHODS 

The ¹⁵N- and ¹⁵N/¹³C-labeled amino acids and a mixture
of ¹⁵N/¹³C-labeled amino acids were from Cambridge
Isotope Laboratories (Andover, MA, USA).
p-Benzoyl-L-phenylalanine (Bpa) was from Peptech (Burlington,
MA, USA). All other standard reagents required for
cell-free protein synthesis were as described previously
(31,32). New plasmids for overproduction of proteins or
their cell-free synthesis were derivatives of the T7
promoter vectors pETMCSI, pETMCSII or pETMSCIII
(33) and were constructed by standard methods, usually
involving restriction digestion of PCR products and their
insertion between corresponding sites in appropriate
vectors (see Supplementary Methods for full details).
Inserts in all plasmids were confirmed by nucleotide
sequence determination.

In vivo protein expression and purification 

The Pol III subunits α, θ (9), αL and εL (25) and the β2
sliding clamp (34) were purified as described. The αLεLθ
core complex was isolated essentially as described
for wild-type core (25,35). ¹⁵N-εL, ¹⁵N,¹³C-ε186 and
¹⁵N,¹³C-ε193 were expressed in vivo in M9 minimal
medium containing ¹⁵NH4Cl and/or ¹³C-glucose;
¹⁵N,¹³C-ε186, ¹⁵N,¹³C-ε193 and their complexes with θ
were purified essentially as described for ε186 and the
ε186:θ complex, respectively (36). The εCTS-α270 fusion
proteins (constructs A; Figure 1C) and His6-αGL
(Figure 1D) were expressed in vivo and purified as
described in Supplementary Methods. BpaRS was as
described (31). Protein concentrations were determined
spectrophotometrically using calculated values (37) of ε280.
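
The spectrophotometric determination rests on the Beer-Lambert law,
A280 = ε280 × c × l. The following Python sketch illustrates the
arithmetic only; the function names are hypothetical, and the per-residue
coefficients (5500 for Trp, 1490 for Tyr, 125 for cystine, in M⁻¹ cm⁻¹)
are widely used literature values assumed here for illustration. The
calculation method cited as (37) may use slightly different coefficients.

```python
def molar_extinction_280(sequence, n_cystines=0):
    """Estimate epsilon_280 (M^-1 cm^-1) from Trp, Tyr and cystine counts.

    Coefficients are assumed literature values, not taken from this paper.
    """
    n_trp = sequence.count("W")
    n_tyr = sequence.count("Y")
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystines


def concentration_molar(a280, epsilon_280, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l."""
    if epsilon_280 <= 0:
        raise ValueError("epsilon_280 must be positive (no Trp/Tyr/cystine?)")
    return a280 / (epsilon_280 * path_cm)


# Toy sequence (hypothetical, not a Pol III subunit)
eps = molar_extinction_280("MWKYLLWG")
print(eps, concentration_molar(0.6245, eps))
```

Multiplying the molar concentration by the molecular mass gives the
concentration in mg ml⁻¹.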

Isotope labeling of α270 and α270 or εCTS in the
α270:εCTS complex

Plasmid pK01367 was used at a concentration of
16 µg ml⁻¹ for cell-free synthesis of α270-His6 in 0.6 ml
reaction mixtures at 30°C overnight. Five ¹⁵N-labeled
α270-His6 samples were prepared following the combinatorial
labeling scheme and reaction conditions described
previously (32,38-40). The soluble fraction of α270-His6
was purified using ProPur IMAC Mini Ni-spin columns
(Nalgene Nunc, USA), and the purified protein was
dialysed against 2 l of NMR buffer (20 mM Tris.HCl, pH
7.0, 150 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol)
and concentrated to a final volume of about 0.2 ml using
Millipore Ultra-4 centrifugal filters (MWCO 10 kDa).
D2O was added to a final concentration of 10% (v/v)
prior to NMR measurements. In the same way, five samples
of combinatorially ¹³C- and uniformly ¹⁵N-labeled
α270-His6 were made by cell-free synthesis, using the
requisite mixtures of isotope labeled amino acids (41).
For improved sensitivity in the NMR experiments, two
0.6 ml reactions were pooled for each sample. Uniformly
¹³C/¹⁵N-labeled α270-His6 was made by cell-free synthesis
in two 0.6 ml reactions, using a mixture of labeled amino
acids as described (42).

Plasmid pK01422 was used at 16 µg ml⁻¹ for cell-free
synthesis of εCTS (Figure 1B). All samples of the
α270-His6:εCTS complex were purified and prepared for NMR
as described above for α270-His6. One set of five samples
contained combinatorially ¹⁵N-labeled α270 in complex
with unlabeled εCTS; they were made by cell-free synthesis
of εCTS in the presence of the combinatorially ¹⁵N-labeled
α270-His6 samples described above. A second set of five
samples contained combinatorially ¹⁵N-labeled εCTS in
complex with unlabeled α270-His6; the εCTS was
produced by cell-free synthesis in the presence of unlabeled
α270-His6, which had itself been synthesized in a
separate cell-free reaction. A third set of five samples
contained combinatorially ¹³C- and uniformly ¹⁵N-labeled
εCTS in the presence of separately purified and unlabeled
α270-His6. Cell-free synthesis of these five samples used
two 0.6 ml reaction mixtures.

NMR spectroscopy 

All NMR spectra were recorded at 25°C using Bruker 600
and 800 MHz NMR spectrometers equipped with
cryoprobes, using 200 µl solutions in 3 mm sample tubes.
¹⁵N-HSQC spectra used t1max = 32 ms, t2max = 102 ms
and total recording times of 1-13 h. 2D HN(CO) spectra
and 3D HN(CO)CA and HNCA spectra were recorded in
20-24 h per spectrum. D2O was added to all samples to a
final concentration of 10% (v/v) prior to NMR
measurements.

Crystallography 

The εCTS35-α270(L21P) and εCTS44-α270(L21P)
proteins were concentrated to 9.5-10 mg ml⁻¹, respectively,
by precipitation with ammonium sulfate (0.35 g
ml⁻¹); the pellets were dissolved in and extensively
dialysed against 10 mM Tris.HCl (pH 7.6), 1 mM
EDTA, 1 mM dithiothreitol, 0.1 M NaCl. The crystals
used for data collection were grown at 4°C in sitting
drops with 4.5 µl of protein mixed with an equal volume
of reservoir solution of 0.1 M Tris (pH 8.4), 0.2 M MgCl2,
3 mM tris(carboxyethyl)phosphine (TCEP), 16% (w/v)
PEG 3350. Rectangular prisms 300-400 µm in length
appeared within 3-4 days. They were cryoprotected by
two transfers (5 min each) in reservoir solution supplemented
with 15% (w/v) PEG 400 before being frozen
for data collection at 100 K. X-ray data were collected
on Beamline MX1 [εCTS35-α270(L21P)] or MX2
[εCTS44-α270(L21P)] at wavelengths of 0.96858 and
0.95369 Å, respectively.



The structure of εCTS35-α270(L21P) was solved at
1.7 Å resolution by molecular replacement, using the
corresponding domain from the reported structure of E. coli
α917 (10) as starting model to calculate phase information.
The structure of εCTS44-α270(L21P) was subsequently
solved at 2.15 Å resolution using the refined
structure of εCTS35-α270(L21P) as starting model. Final
models were obtained following cycles of refinement using
REFMAC (43) and manual building using COOT (44).
Data collection and refinement statistics are given in
Supplementary Table S1.

DNA templates for site-directed Bpa mutants of α270

For site-specific incorporation of the unnatural amino
acid Bpa into α270, amber stop codons were engineered
at the corresponding sites of the dnaE(1-270) gene;
primers used are listed in Supplementary Methods. The
first five amber mutations were created by the Phusion
site-directed mutagenesis kit (Finnzymes, Finland), and
the genes were inserted between the NdeI and EcoRI
sites of the T7 promoter vector pRSET-6b (45). The
resulting plasmids pK01481-1485 have the codons of Pro4,
Asp25, Asp75, Gln106 and Lys229, respectively, replaced
by amber codons and were used as DNA templates in
cell-free synthesis reactions. Linear templates for cell-free
synthesis of additional amber mutants (codons for Arg175,
Tyr234 and Gln237) were generated by strand overlap
PCR as described (46) using Vent DNA polymerase with
outside primers and pairs of mutagenic primers (see
Supplementary Methods). The PCR products were separately
purified from an agarose gel using NucleoSpin
Extract II kits (Macherey-Nagel, Germany). T7
promoter and terminator sequences were appended in
two further separate PCR reactions (50 µl each) (46)
with a mixture of 20-30 ng of purified PCR products
from the previous step. Mixing of two sets of primer
pairs in approximately equimolar ratio, removal of the
residual primers by the NucleoSpin kit, denaturation at
95°C (5 min) and reannealing at room temperature
(5 min) yielded DNA with complementary 8-nucleotide
overhangs suitable for cyclization by the intrinsic ligase
activity of the cell-free extract.
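
The codon replacement that underlies these amber mutants can be
illustrated with a minimal Python sketch. The function name and the toy
coding sequence are hypothetical; they are not the dnaE(1-270) sequence
or the mutagenic primers listed in Supplementary Methods.

```python
AMBER = "TAG"  # amber stop codon


def introduce_amber(cds, residue_number):
    """Return cds with the codon of residue_number (1-based) replaced by TAG."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (residue_number - 1)
    if residue_number < 1 or start >= len(cds):
        raise ValueError("residue number outside the coding sequence")
    return cds[:start] + AMBER + cds[start + 3:]


# Toy example (hypothetical sequence): Met-Pro-Asp with the Pro2 codon replaced
print(introduce_amber("ATGCCGGAT", 2))
```

In practice the mutation is carried by primers that span the replaced
codon, so the sketch shows only the intended end product of the
mutagenesis.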

Complexes of Bpa mutants of α270 and the εCTS

Cell-free reactions were as described (31,32), with added
Bpa (1 mM) and BpaRS (4-15 µM), as required. Plasmid
templates pSH1017, pK01367 and pK01422 were used at
16 µg ml⁻¹ for production of ε, α270-His6 and εCTS59
(Figure 1B), respectively. The reannealed amber mutant
PCR products described above were used as template at
~10 µg ml⁻¹. The protein complexes between α270-His6
(Bpa mutants or wild-type) and the εCTS were made by
simultaneous cell-free synthesis of the two proteins in the
same reaction mixtures. The α270-His6:ε:θ complex was produced
by making ε in the presence of purified α270-His6 and θ or
by co-synthesis of α270-His6 and ε in the presence of
separately purified θ. The complex of α270 with
εCTS-Bpa-His6 was made by cell-free co-synthesis of these partner
proteins in the same reaction mixture. The reaction
mixtures were then clarified by centrifugation (100 000 g,
1 h) at 4°C. The supernatants were loaded onto ProPur
IMAC Mini Ni-spin columns and the complexes were partially
purified by virtue of the C-terminal His6-tag of α270.
The purified complexes were concentrated to ~0.1 ml
using Millipore Ultra-4 centrifugal filters (MWCO
10 kDa), replacing the buffer with 10 mM sodium phosphate
(pH 6.8), 100 mM NaCl, 1 mM dithiothreitol for
photo-crosslinking experiments.

Photo-crosslinking and LC-MS/MS analysis of 
crosslinked adducts 

The isolated wild-type and Bpa-containing protein
complexes (5-8 mg ml⁻¹) were irradiated at 312 nm for
1 min using a Mini UV transilluminator system BTS
20 M (GAS700X) (UVItec, UK) and subsequently
analysed by SDS-PAGE. The photo-crosslinked α270:ε
or α270:εCTS adducts were analysed by LC-ESI-ion trap
mass spectrometry/mass spectrometry (LC-MS/MS) using
a described protocol for in-gel trypsin digestion of
gel-fractionated proteins (47). The solution containing the
tryptic peptides that diffused from the gel pieces was
desalted using C18 Zip-tips (Millipore), dried in a desiccator
and dissolved in 20 µl of 15% acetonitrile/1% formic
acid for LC-MS/MS analysis using an Agilent 6530
Accurate Mass Q-TOF LC/MS.

Preparation of the His6-αGL:¹⁵N-εL:θ complex and
titration with β2

A mixture of 11.0 mg of purified His6-αGL, 2.7 mg θ and
5.6 mg ¹⁵N-εL was treated at 0°C for 1 h, then dialysed
against 50 mM HEPES-KOH (pH 7.5), 300 mM NaCl,
20 mM imidazole, 5% (v/v) glycerol (buffer A). The
sample (10 ml) was separated from excess θ and εL on a
5 ml Ni-NTA column in buffer A (eluted in a linear
20-500 mM imidazole gradient). The His6-αGL:¹⁵N-εL:θ
complex was concentrated to 100 µM in NMR buffer
using Millipore Ultra-15 centrifugal filters (MWCO
10 kDa) and stored at −80°C. Sample purity was
assessed by 15% SDS-PAGE. ¹⁵N-HSQC spectra were
recorded before and after addition of concentrated β2
(separately dialysed in NMR buffer) to 50, 100, 150,
200, 300 and 400 µM (as dimer).

The His6-αGL:¹⁵N-εL:θ:β2 complex was separately
isolated from a mixture of 100 µM His6-αGL:¹⁵N-εL:θ
and 400 µM β2 by gel filtration (Supplementary Figure
S1). The protein complex was concentrated to about
40 µM in NMR buffer using Millipore Ultra-4 centrifugal
filters (MWCO 10 kDa) and stored at −80°C.

NMR titration of ¹⁵N/¹³C-ε:θ complexes with β2

Complexes of ¹⁵N,¹³C-ε186 and ¹⁵N,¹³C-ε193 with
purified unlabeled θ were dialysed into NMR buffer and
concentrated using Amicon Ultra-4 centrifugal filters
(MWCO 10 kDa, Millipore). Resonances in the
¹⁵N-HSQC spectra of ε186 and ε193 in the two complexes
with θ were assigned by reference to previous assignments
of ε186 in ε186:θ (BioMagRes database entry: bmrb6184),
our ε assignments in the αεθ complex (9) and new experimental
data for residues Ala188, Gln182-Ala186 and
Thr193 obtained from 2D HN(CO) spectra and 3D
HN(CO)CA and HNCA spectra. Phe187 was assigned
through combinatorial labeling with ¹⁵N and ¹⁵N/¹³C to
identify the Ala186-Phe187 dipeptide. The assignment of
resonances in the CBM of ε193 (Gln-Thr-Ser-Met-Ala-Phe)
was confirmed using cell-free residue-specific
¹⁵N-labeling of wild-type and two CBM mutants of
ε193, that is, εL193 (CBM: QLSLPL) and εQ193 (CBM:
ATSMAF) (25), with these six amino acids, and of εL with
¹⁵N-Leu, all in the presence of excess unlabeled θ.

NMR titration of the ¹⁵N,¹³C-ε186:θ complex (100 µM)
with β2 was performed by recording ¹⁵N-HSQC spectra
before and after progressive addition of concentrated
purified β2 to 100, 200, 300 and 400 µM, whereas the
¹⁵N,¹³C-ε193:θ complex (34 µM) was similarly titrated
with 34 and 68 µM β2. Spectra were also recorded of the
¹⁵N-Gln, Thr, Ser, Met, Ala, Phe labeled ε193:unlabeled θ
(27 µM) sample with and without added β2 at 30 µM.

Small-angle X-ray scattering 

A mixture of the αLεLθ core (1 mg) and β2 (2 mg) was
dialysed into buffer C (50 mM Tris.HCl pH 7.6, 1 mM
EDTA, 1 mM dithiothreitol, 10% v/v glycerol) containing
100 mM NaCl. The stoichiometric αLεLθ:β2 complex was
separated from excess β2 by anion exchange chromatography
on a MonoQ 5/50 GL column (GE Healthcare)
using a gradient of 0.1-1.0 M NaCl in buffer C,
concentrated to 2.15 mg ml⁻¹ using an Amicon Ultra
0.5 ml centrifugal concentrator (Millipore) and stored
frozen at −80°C. SDS-PAGE was used to confirm the
presence of all four subunits.

Scattering data were recorded on the SAXS/WAXS
beamline at the Australian Synchrotron. The complex
was analysed by size-exclusion chromatography-coupled
small-angle X-ray scattering (SEC-SAXS). The complex
was dialysed into 50 mM Tris.HCl pH 8.0, 0.1 M NaCl,
1 mM EDTA, 1 mM TCEP, 5% (v/v) glycerol and 70 µl
were injected at 0.5 ml min⁻¹ onto a Wyatt WTC-030S5
SEC column (7.8 x 300 mm) equilibrated at 12°C in the
same buffer. A280 of the eluate was monitored immediately
prior to its passage through a quartz capillary that was
illuminated by a collimated 11 keV X-ray beam,
λ = 1.127 Å. Scattering from the sample was measured
by a Pilatus 1M detector (Dectris, Switzerland) that
recorded 2D scattering images in 2 s exposures from a
position 3349 mm behind the sample. For all the frames
used for the data analysis, no protein damage induced by
X-rays was observed. Scattering from the eluate was stable
as averaged from 10 exposures prior to and after elution of
protein, to give the buffer scattering. Following radial
averaging and buffer subtraction, the radii of gyration,
Rg, of five exposure bins were determined by Guinier
analysis using AUTORG (48) and plotted against
elution volume. Sample scattering was averaged across
the region of Rg stability, which corresponded to the
main UV absorption peak in the elution profile and
encompassed 20 exposures. The scattering pattern was
truncated to the range 0.012 < Q < 0.16 Å⁻¹. The
theoretical SAXS patterns, radii of gyration and envelope
volumes of various atomic models were calculated using
CRYSOL (49), for comparison with experimental results.
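
Two of the quantities above follow from simple relations: the wavelength
from the photon energy (λ = hc/E) and Rg from the Guinier approximation,
ln I(Q) ≈ ln I(0) − (Rg²/3)Q², valid at low Q. The Python sketch below
is an illustration under stated assumptions, not the AUTORG procedure:
it fits all supplied points by unweighted least squares, whereas AUTORG
selects the Guinier range automatically. The function names and the
synthetic data are hypothetical.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # h*c in keV * Angstrom


def wavelength_angstrom(energy_kev):
    """X-ray wavelength (Angstrom) from photon energy (keV)."""
    return HC_KEV_ANGSTROM / energy_kev


def guinier_fit(q, intensity):
    """Unweighted linear fit of ln I versus Q^2; returns (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: data are not in a Guinier regime")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free data for a hypothetical particle with Rg = 40 Angstrom
q_values = [0.012 + 0.001 * i for i in range(15)]
i_values = [100.0 * math.exp(-(40.0 ** 2 / 3.0) * qi * qi) for qi in q_values]
print(wavelength_angstrom(11.0), guinier_fit(q_values, i_values))
```

The fit is only meaningful where Q·Rg stays small (a limit of about 1.3
is commonly applied to globular particles), which is why the synthetic Q
range is kept short.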
RESULTS

Cell-free but not in vivo expression of the PHP domain of
α yields correctly folded protein

As full-length E. coli α is prone to proteolysis and is
unsuitable for crystallization (10), we studied the interaction
between domains of α and ε. It had been shown that the
N-terminal 320-residue fragment of α binds ε (19), but the
PHP domain of α defined subsequently ends already at
residue 270 (10). However, a soluble construct comprising
the N-terminal 270 residues (α270) produced in vivo
appeared to be unfolded as indicated by the ¹⁵N-HSQC
NMR spectrum of a ¹⁵N-labeled sample (Supplementary
Figure S2A). In contrast, samples made by cell-free
protein synthesis routinely showed a chemical shift dispersion
characteristic of a well-structured globular domain
(Supplementary Figure S2B). Therefore, we subsequently
produced all samples of α270 by cell-free synthesis.

Coarse mapping of the interface between α270 and the
εCTS by NMR

Expression of a construct comprising the flexible 59
C-terminal residues of ε preceded by a 22-residue tag
(εCTS59; Figure 1B) in the presence of ¹⁵N-labeled α270
led to a soluble complex that retained the overall chemical
shift dispersion of α270 with some significant chemical
shift changes, as expected for specific binding
(Supplementary Figure S3A).

We also used NMR spectroscopy to map the binding
site of the εCTS on α270. As the stability and concentration
of α270 samples were insufficient for conventional
triple-resonance NMR experiments, combinatorial
labeling was used to obtain resonance assignments
(38,39). Five samples were prepared, in which different
residues of α270 were labeled with ¹⁵N in different
combinations (Supplementary Figure S4A and B), allowing
the residue-type identification of the ¹⁵N-HSQC
cross-peaks. In addition, five samples were prepared with
combinatorial ¹³C-labeling and uniform ¹⁵N-labeling. 2D
HN(CO) NMR spectra of these samples provided the
residue-type information of the preceding amino acid for
each ¹⁵N-HSQC cross-peak (Supplementary Figures S3B
and S4C) (50). In combination, the 10 samples provided
sequence-specific resonance assignments for 50 ¹⁵N-HSQC
cross-peaks arising from amino acid pairs that are unique
in the sequence of α270 (Supplementary Table S2).
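
The logic of this assignment strategy can be sketched in Python: a
cross-peak's pattern of presence across the labeled samples identifies
its residue type, and a (preceding residue, residue) pair that occurs
only once in the sequence pins the cross-peak to one position. The
function names, the three-sample toy scheme and the toy sequence below
are hypothetical illustrations; the actual five-sample schemes are those
of references (38,39) and Supplementary Figure S4.

```python
from collections import Counter


def residue_types_for_pattern(scheme, observed_samples):
    """Residue types whose labeling pattern matches the observed samples.

    scheme maps a one-letter amino acid code to the set of sample numbers
    in which that amino acid is labeled.
    """
    observed = frozenset(observed_samples)
    return sorted(aa for aa, samples in scheme.items()
                  if frozenset(samples) == observed)


def unique_pairs(sequence):
    """Map the 1-based number of the second residue to each dipeptide
    that occurs exactly once in the sequence."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    counts = Counter(pairs)
    return {i + 2: pair for i, pair in enumerate(pairs) if counts[pair] == 1}


# Toy scheme (hypothetical): each amino acid is labeled in two of three samples
toy_scheme = {"A": {1, 2}, "F": {2, 3}, "L": {1, 3}}
print(residue_types_for_pattern(toy_scheme, [2, 3]))
print(unique_pairs("MAFAF"))
```

In the toy sequence, the Ala-Phe pair occurs twice and therefore cannot
be assigned from pair information alone, whereas Met-Ala and Phe-Ala are
unique.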

A second set of five combinatorially ¹⁵N-labeled
samples of α270 was prepared in complex with unlabeled
εCTS59 and ¹⁵N-HSQC spectra were recorded (data not
shown). Significant chemical shift changes occurred
throughout the α270 domain, but the two largest were
observed for amides within 15 Å of its N- and
C-terminal ends (Supplementary Figure S3). This
indicated that in contrast to α270, a fusion construct of
it with the εCTS could be a stable, well-folded protein.

The ¹⁵N-HSQC spectrum of the ¹⁵N-εCTS59 construct
in complex with unlabeled α270 displayed many narrow
lines, indicating that much of it is highly mobile and not
tightly interacting with α270. For resonance assignments,
εCTS59 in the complex with unlabeled α270 was
combinatorially ¹⁵N-labeled (Supplementary Figure S5A)
and, in a second set of five samples, labeled combinatorially
with ¹³C and uniformly with ¹⁵N. 3D HNCA and
HN(CO)CA experiments with the second set were used
to assign most of the flexible amino acid residues in the
εCTS59 construct. Like 2D HN(CO) experiments, the 3D
HN(CO)CA spectra of the combinatorially labeled
samples identified for each ¹⁵N-HSQC cross-peak the
amino acid type of the preceding residue. In addition,
the HN(CO)CA spectra delivered its Cα chemical shift
which, together with the HNCA spectrum, provided
more secure resonance assignments than could have
been obtained from a single uniformly ¹⁵N/¹³C labeled
sample. The final resonance assignments corresponded
to the segment from Phe187 to Ala209. In addition,
two-thirds of the residues of the non-native N-terminal
22-residue tag of the εCTS59 construct were assigned in
the complex (Supplementary Table S3). The narrow line
widths and random coil chemical shifts of these residues
indicate high flexibility as expected. Most notably, no
signals could be observed for the C-terminal segment
(Ser210 to Ala243) of the εCTS, except for the amides of
Glu221 and Gly237, indicating immobilization by tight
association with α270. The delineation of the flexible
residues in the εCTS was used to design fusion constructs
of the εCTS with α270.

Crystal structures of intramolecular α270-εCTS
complexes

The α270 domain was fused to the εCTS (residues
209-243) via a nine-residue linker, where the εCTS was
N-terminal of α270 in construct A and C-terminal of
α270 in construct B (Figure 1C). To assess the impact of
the fusion on the structural integrity of α270, we
compared the ¹⁵N-HSQC spectra of selectively
¹⁵N-alanine labeled samples of constructs A and B with
corresponding spectra of the similarly ¹⁵N-alanine labeled
non-covalent α270:εCTS59 complex. All samples were
readily produced in soluble form by cell-free synthesis.
The NMR spectrum of construct A was much more
similar to the spectrum of the α270:εCTS59 complex
than the spectrum of construct B (Supplementary Figure
S6A and B), suggesting that construct A is a stable
intramolecular α270:εCTS complex.

Both constructs were expressed in vivo, and could be
purified in soluble form for use in crystallization trials,
but neither yielded crystals. Crystals were obtained,
however, for a variant of construct A that contained a
fortuitous point mutation in the α270 domain,
Leu21Pro. Some of the cross-peaks were broad or
missing from the ¹⁵N-HSQC spectrum of the
¹⁵N-alanine labeled mutant protein, suggesting conformational
exchange broadening of some of the NMR signals
(Supplementary Figure S6C and D). The crystal structure,
however, shows no evidence of conformational heterogeneity,
which may be explained by crystal contacts involving
Pro21 leading to preferential crystallization of a particular
conformer; in all four molecules in the asymmetric unit,
Pro21 makes a crystal contact with an alanine residue.
Figure 1. Protein constructs used in the present work. (A) The ε
subunit is the proofreading 3′-5′ exonuclease of Pol III. It binds
tightly to the α subunit via its C-terminal segment (εCTS). The
globular domain of ε, ε186 (12) binds to the θ subunit (15,16). The
εCTS comprises residues 181-243, including the β CBM at residues
182-187 (in red); the quadruple-mutant T183L/M185L/A186P/F187L
(εL) has a strengthened CBM (25). The α-binding site lies in the
C-terminal part of the εCTS. At least 19 residues between ε186 and
the α-binding site (i.e., between Thr183 and Thr201) remain flexible in
the αεθ complex; Ala and Thr residues for which flexibility was
established are indicated by asterisks (9). We refer to the segment comprising
residues 190-209 (in orange) as the Q-linker (21). The present work
reports: (i) that residues between Phe187 and Ala209 of εCTS
(indicated by asterisks) remain flexible in the complex of εCTS59 and
the PHP domain of α, α270 (Supplementary Table S3); (ii) the 3D
structure of the complex between the εCTS (in purple) and the PHP
domain of α (Figure 2), showing that part of the α-binding segment of ε
forms a helix (residues 218-237) upon binding; and (iii) that residues
between Glu190 and Arg204 (indicated by asterisks) remain flexible in a
stabilized mutant version of the αεθ:β2 complex (Figure 4). The
C-terminal residues of the ε186 and ε193 constructs are indicated.
(B) Amino acid sequence of the εCTS59 construct. Residues 185-243
of ε are labeled with the sequence numbers of full-length ε. The
preceding residues in italics are not part of ε; they comprise a T7
gene 10 tag (resulting in the N-terminal peptide MASMTG) for
improved cell-free expression yields and a biotinylation site. (C) The



The 1.70 Å crystal structure of the α(Leu21Pro) mutant
of construct A was solved by molecular replacement using
the α270 domain from the structure of α917 (PDB: 2HQA)
(10) as starting model (Supplementary Table S1). The four
molecules in the unit cell show a maximum Cα RMSD of
0.284 Å in pairwise alignments, and in all four the structure
is fully ordered from Lys211 of εCTS (numbered
throughout as in full-length ε) through the linker region
and the entire α-PHP domain to the final residue, Thr270.
In one of the monomers, the backbone and side chain
of Ser210 are also ordered, and in another Lys211 has
alternate side chain conformations.

The refined structure of the εCTS35-α270 protein shows
the εCTS assuming an extended structure across one face
of α270, followed by an α-helix. The C-terminal residues
that follow are located in a pocket formed by α270 and the
α-helix of the εCTS (Figure 2). The εCTS portion is fully
structured and in contact with α between residues Lys211
and the C-terminus of ε; electrostatic and H-bonded
contacts between α and ε are listed in Supplementary
Table S4, and these are complemented by a much larger
number of hydrophobic interactions. Although it is
engaged in crystal contacts, the linker peptide connecting
the εCTS with the α270 domain is solvent exposed, so that
the structure of the complex is unlikely to be affected by
its length or conformation. The mutated residue Pro21 of
α270 is far removed from the εCTS binding region on the
opposite face of the PHP domain (Figure 2A) that is in
closer proximity to the polymerase active site in the α917
structure (10).

To confirm the NMR data that εCTS in complex with
α270 is indeed unstructured in the linker preceding
Ala209, we made a longer type A fusion construct
commencing at Ala200 (i.e., εCTS44-α270), crystallized it
under similar conditions, and solved its structure at 2.15 Å
(Supplementary Table S1). In two of the four chains in the
asymmetric unit, residues preceding Lys211 were still
disordered, but in one of the other two, weak electron
density was interpreted as the tetrapeptide segment
Ile202-Ile205 that has additional interactions with the
region around His183 and Asp252 of α270 (Figure 2
and Supplementary Table S4), and Ser210 was also
fully ordered. Because the Val206-Ala209 segment is



Figure 1. Continued 

first 270 residues of α (α270) contain the PHP domain. To determine
the 3D structure of the εCTS in complex with α270, εCTS35 (residues
209-243) or εCTS44 (residues 200-243) was fused to either the
N-terminus (constructs A; εCTS35-α270 and εCTS44-α270) or the
C-terminus (construct B; α270-εCTS35) of α270. The amino acid
sequence of the nine-residue linker is similar to one that had been
determined to be flexible in another context (27,28). (D) Purification
of the αεθ:β2 complex is difficult due to the limited affinity of β2 for the
αεθ core. Increased affinity between α and β2 (for SAXS measurements)
was achieved by changing residues 920-924 (internal CBM) from
QADMF to QLDLF (26) to produce a mutant we refer to as αL
(25), and then (for NMR measurements) introducing a further
Val832Gly mutation (25,54) to yield αGL; αGL also contains an
N-terminal His6 tag for purification. The figure shows the sites of
Val832 and the internal CBM plotted in blue on the structure of the
α subunit from T. aquaticus (11). The PHP domain is shown in orange
and the Mg2+ ion in the active site in magenta.

Figure 2. 1.7 Å crystal structure of construct A (εCTS35-α270) with the
Leu21Pro mutation in α270. In the ribbon diagram in (A), εCTS35 and
α270 are in green and cyan, respectively, while the nine-residue linker
between the εCTS and α270 is in yellow. The location of residue 21 (in
magenta) and the N- and C-termini of α270 (αM1, αT270) are
indicated. The εCTS35 region is fully structured from Lys211 to Ala243,
and forms an α-helical segment between Thr218 and Gly237. The
additional contacts of residues Ser210 and Ile202-Ile205 of εCTS
with α270 in the 2.15 Å structure of εCTS44-α270(L21P) are also
indicated. (B) A view in the same orientation with α270 in space-
filling representation (gray) and side chains of selected residues of
εCTS shown as sticks (green). Residues Ser210 to Ala217
of εCTS form an extended structure that lies in a groove in α270. A
list of H-bonding and electrostatic contacts between residues in α and ε
is given in Supplementary Table S4.

disordered, we are unable to tell if this εCTS tetrapeptide
derives from the same molecule to which it is bound, or
from a neighboring molecule in the crystal lattice, and the
NMR data show that this segment is inherently flexible in
solution. Although it seems probable that these additional
interactions are rather transient in the context of the full
αεθ core complex, they nevertheless indicate where in
space the flexible segment of ε (between Ala188 and
Ala209) is likely to reside, at least in the closed form of
the αεθ:β2 complex during DNA synthesis (25).

The PHP domain of α in the crystal structures of
construct A and in α917 (10) is fully conserved structurally,
except around the site of the Leu21Pro mutation, with a
backbone RMSD of 0.532 Å over 253 Cα atoms. This
allows straightforward modeling of the εCTS onto the
structure of α917. As discussed in detail below (see also
Supplementary Movie S1), the C-terminus of ε binds to α
in a position that would place the extended peptide
segment immediately preceding the C-terminal helix of ε
and the exonuclease active site far from that of the
polymerase. Its unusually remote location raises questions
about how the N-terminal exonuclease domain of ε
gains access to a mismatched primer terminus when
proofreading is required. Thus, we sought to obtain further
information about the location in the complex with α of the
linker peptide that extends in ε from the CBM (i.e., from
Ala188) to the structured part in the crystal structures
above.

Photo-crosslinking to localize the flexible peptide segment
of the εCTS on α270

In agreement with the NMR evidence for high mobility of
residues prior to Ala209, the crystal structures of εCTS in
the two type A constructs showed consistent electron
density only from Lys211 onwards. To explore the
location of the flexible residues of the Q-linker, we
introduced the unnatural amino acid p-benzoyl-L-phenylalanine
(Bpa) at different sites in α270, using the orthogonal
Methanococcus jannaschii system developed by P.G.
Schultz and co-workers, where the site of Bpa incorporation
is encoded by an amber stop codon (51). The purification
of the mutants was facilitated by producing the
protein in a cell-free system, which relied on purified
plasmid DNA with amber stop codons, purified Bpa-tRNA
synthetase and a total tRNA preparation that contained
the amber suppressor tRNA (31). Most Bpa
mutants were produced in yields of up to 1.5 mg/ml in
7 h without evidence of truncation at the amber stop
codon (Supplementary Figure S7A). Full-length proteins
were readily purified using a Ni-NTA spin column, as all
α270 mutants carried a C-terminal His6-tag. The production
of the Ala25Bpa mutant, which was initially expressed
in low yields, was improved dramatically by
using the optimized (52) amber suppressor tRNA and
doubling the amount of total tRNA (31).

Cell-free expression of ε in the presence of the purified
Bpa mutants of α270 and of separately purified θ
produced stable soluble complexes that could be purified
as shown previously for wild-type α (9). Similarly, soluble
complexes of the α270 Bpa mutants with the εCTS59
construct were obtained by cell-free co-expression of the
α270 mutants and of εCTS59 (Supplementary Figure
S7B and C).

UV irradiation (312 nm, 1 min) effects photo-crosslinking
of Bpa to nearby residues (<3 Å) (53). SDS-PAGE
revealed crosslinking with full-length ε when Bpa
was located in positions 19, 21, 23, 25 and 229 of α270, but
no crosslinks were observed with Bpa at positions 4, 75
and 106 (Supplementary Figure S7B and data not shown).
Mass spectrometric analysis of in-gel tryptic digests
confirmed that the crosslinks were with the εCTS rather than
the globular N-terminal domain of ε. Additional experiments
were carried out with the εCTS59 construct to eliminate
the need to exclude binding to the N-terminal
domain of ε. Bpa mutants at positions 175, 229, 234 and
237 displayed crosslinks with εCTS59 (Figure 3;
Supplementary Figure S7C). The wide distribution of
crosslinking sites across the surface of α270 confirms the
NMR observation of high flexibility in the linker segment
of ε before the C-terminal α270-binding region. Most
interestingly, the Lys229Bpa mutant readily crosslinked
with a peptide segment preceding Gln196 in εCTS59
(Supplementary Figure S8), although Lys229 is located
on the opposite face of the PHP domain compared to
the binding site of the C-terminus of ε. Therefore, the
Q-linker region of the εCTS readily wraps around the
PHP domain of α but is not poised for specific binding
interactions with the PHP domain.

Photo-crosslinking experiments between α270 and the
εCTS were also conducted with an εCTS construct that
was extended at its C-terminus by Bpa-His6. MS analysis
of a tryptic in-gel digest of the cross-linked complex
revealed linkage to the segment of residues Ala31-Lys52
in α (data not shown). This result is in agreement with the
crystal structure, which positions the peptide linker
between the εCTS and α270 within 11 Å of the peptide
identified by MS.

The εCTS Q-linker remains flexible in the αεθ:β2 complex

The ε subunit harbors a CBM immediately following the
exonuclease domain (i.e., residues 182-187) (25), and α
also contains a CBM between residues 920 and 924 (24).
Although each binding interaction is individually weak,
the cooperativity of binding of α to one subunit of the
β2 dimer and of ε to the other maintains the integrity of
the αεθ:β2 replicase complex with DNA during highly
processive DNA replication (25). The questions remain
whether such a binding arrangement is compatible with
the available structural information on the replisome
subunits and whether the εCTS can accommodate this
arrangement. To investigate the structural confinement
of the εCTS in the αεθ:β2 complex, we studied the NMR
spectrum of the αGLεLθ:β2 complex, where αGL and εL are
mutants of α and ε with improved binding affinities to β2
(Figure 1A and D); in addition to strengthening mutations
in the CBM (as in αL) (25,26), αGL also contains an
additional mutation (Val832Gly; spq-2) (25,54) that by itself
strengthens binding of αεθ to β2 (S.J. and Thitima
Urathamakul, unpublished). For example, a peptide
with an optimized CBM as in εL interacts about 500-fold
more strongly with β2 than the wild-type CBM of ε (25),
while the mutations (A921L, M923L) in the CBM of α
strengthen binding to β2 120-fold (26). For NMR
measurements, αGLεLθ was made with uniformly 15N-labeled εL
and mixed with a 3-fold excess of β2; the stoichiometric



Figure 3. Sites in α270 where Bpa residues were introduced for photo-
crosslinking with ε to detect proximity to residues for which no
structural information was obtained by the crystal structure in Figure 2. The
location of the εCTS determined by the crystal structure is shown in
yellow, with the nine-residue linker peptide in green. The Cβ atoms of
residues at sites leading to efficient, less efficient or no crosslinking are
highlighted in red, magenta and cyan, respectively.



αGLεLθ:β2 complex was stable enough to be isolated by gel
filtration (Supplementary Figure S1).
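
The fold changes in binding strength quoted for the strengthened CBMs can be put on an energy scale. The sketch below is an illustration under stated assumptions, not a calculation from the paper: it reads "n-fold more strongly" as an n-fold lower dissociation constant, assumes a temperature of 298.15 K, and uses the hypothetical helper name `ddg_from_fold`.

```python
import math

R_GAS = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def ddg_from_fold(fold, temperature_k=298.15):
    """Change in binding free energy (kJ/mol) for a `fold`-times lower Kd.

    Uses ddG = -R * T * ln(fold); negative values mean tighter binding.
    The default temperature is an assumption, not a value from the text.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -R_GAS * temperature_k * math.log(fold)


# Fold changes quoted in the text for the two strengthened CBMs.
for label, fold in (("CBM of epsilon", 500), ("CBM of alpha", 120)):
    print(f"{label}: {ddg_from_fold(fold):.1f} kJ/mol")
```

Under these assumptions, the 500-fold and 120-fold gains correspond to roughly 15 and 12 kJ/mol of additional binding free energy, respectively.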

The molecular mass of the αGLεLθ and αGLεLθ:β2
complexes is so high (165 and 245 kDa, respectively) that
only highly mobile peptide segments can generate
cross-peaks in 15N-HSQC spectra. Remarkably, almost all the
cross-peaks that could be observed for the αGLεLθ
complex (Figure 4A and Supplementary Figure S5B)
could also be observed in the presence of β2 (Figure 4B
and C), although with generally decreased intensity as
expected for the slower overall tumbling rate of the
αGLεLθ:β2 complex (Supplementary Figure S9). The
peaks did not arise from complexes with sub-stoichiometric
amounts of β2, as they were observed even in the
presence of an excess of β2. Assignments for many of
these resonances (residues Glu190-Thr201 and Arg204)
were obtained by comparison with spectra of α270 in
complex with 15N-labeled εCTS (Figure 4B), and
indicate that the Q-linker is clearly still mobile when the
CBM of ε is tied to the β2 clamp.

The ε subunit interacts with β2 only through the CBM

Identification of the role of the CBM just following the
structured domain of ε in DNA replication (25), when
combined with the structure of the εCTS in complex
with the α-PHP domain (Figure 2), enables us to position
the proofreader between the β2 clamp and PHP domain of
α in the αεθ:β2-DNA complex in the polymerization mode
of DNA synthesis (25). We have previously shown by
NMR that ε186 does not interact, even weakly, with α
(9), and we now asked if ε contains a second site for

Figure 4. A large segment of the εCTS remains flexible in the αGLεLθ:β2
complex. The mutant subunits αGL and εL were used in the complex to
avoid dissociation of β2 during purification using the N-terminal His6
tag on αGL. (A) 15N-HSQC spectrum of the αGLεLθ complex with
15N-labeled εL. Only amides from mobile residues are observable in the
165 kDa complex. (B) 15N-HSQC spectrum of the αGLεLθ:β2 complex with
15N-labeled εL. Resonance assignments obtained by comparison with
spectra of α270 in complex with 15N-labeled εCTS are indicated. Most if
not all of the observable peaks can be attributed to the εCTS. The same
set of amides from mobile residues is observable in the purified 245 kDa
complex as in (A). The spectrum was recorded using a 0.1 mM solution of
the complex at 25 °C. (C) Superimposition of the spectra in (A) and (B)
demonstrates that most chemical shifts remain conserved.



interaction with β2 that orients it precisely in the αεθ:β2
complex. To do this, we made a new truncated version of ε
we call ε193 (residues 2-193), that contains all of the
structured exonuclease domain and the CBM
(Figure 1A). We first used cell-free synthesis to prepare,
in the presence of excess unlabeled θ, a sample of ε193
(27 μM) that was 15N-labeled only with amino acids that
comprise the CBM (Gln, Thr, Ser, Met, Ala and Phe), and
assigned these residues in the 15N-HSQC spectrum of the
whole ε193 protein as described in the Materials and Methods
section. Addition of β2 to 30 μM led to disappearance of
signals corresponding to all residues of the CBM
(Gln182-Phe187), but no significant changes to the spectrum of the
structured proofreading domain or of Ala188 and Thr193
in the region beyond the CBM (Figure 5). These data are
the first to directly show the interaction of the CBM in ε
with β2 at single-residue resolution.

We also isolated the complex of uniformly in vivo
15N,13C-labeled ε193 with unlabeled θ, and recorded its
15N-HSQC spectra (at 34 μM) in the absence and
presence of β2 at 34 and 68 μM (data not shown). Once
again, the only cross-peaks broadened beyond detection in
the ε193 spectrum were those in the region of the CBM;
peaks throughout the remainder of the spectrum did not
shift and were only broadened at the highest concentration
of β2, consistent with the exonuclease domain being
freely mobile in the complex with β2, except in the CBM
that interacts directly with the clamp. In further support
of the conclusion that ε contains no site of interaction with
β2 beyond the CBM, we were unable to detect any significant
changes in the 15N-HSQC spectrum of 15N,13C-ε186:θ
(100 μM) on addition of up to 400 μM β2.

Structural modeling of the Pol III replicase complex in
the polymerization mode

The 3D structures of many components of the Pol III
replicase complex are known from different bacterial
sources, including three crystal structures of Pol III α: of
E. coli α917 (10), and of full-length Taq α alone (11) and in
complex with primer-template DNA (55). The DNA-free
protein structures are remarkably similar and reveal an
open state that closes on binding primer-template DNA
(discussed in 25). In addition, the crystal structures of an
E. coli β2:dsDNA complex (23), ε186 (12) and the
ε186:HOT complex (13,14) are known, as well as the
NMR structure of the ε186:θ complex (16).

Combining these atomic-resolution structures with
the present structure of the α270:εCTS complex and the
identification of CBMs in α (24,26) and ε (25), we initially
built a compact model of the replicase complex in the
polymerization mode with the ε186 and θ domains in
available space between the β2 clamp and the β-binding
domain of α (Figure 6A, Supplementary Movie S1 and
Supplementary Pymol Session File S1; model building

Figure 5. Isotope labeled ε193 in complex with purified unlabeled θ
interacts with β2 only through the CBM. Superimposition of 15N-HSQC
spectra of uniformly in vivo 15N/13C-labeled ε193:unlabeled θ (black
spectrum, selected resonance assignments in black) and of ε193:θ (27 μM)
labeled specifically in ε193 with 15N-glutamine, threonine, serine,
methionine, alanine and phenylalanine in the absence (blue spectrum) and
presence (red spectrum) of β2 (30 μM). Resonances were assigned as
described in Materials and Methods. Cross-peaks were observed for 52 of
59 Gln, Thr, Ser, Met, Ala and Phe residues in the structured ε186
domain, and all were unaffected by addition of β2 (selected signals
labeled in purple); those of Ser2 and Thr3 in the disordered N-terminus
could not be assigned, while Ala100, Thr128, Ser144, Ala164 and Thr179
had low intensity even in the absence of β2. Signals in the CBM that
broaden beyond recognition in the presence of β2 (red spectrum; i.e.,
Gln182, Thr183, Ser184, Met185, Ala186, Phe187) are labeled in green,
while assignments for flexible residues at the N- and C-termini (Ala4,
Ala188 and Thr193) that are unaffected by β2 are labeled in orange.



is described in Supplementary Methods). This model
fulfils all the known restraints, including the current
results that the Q-linker region in ε is flexible and at
least transiently close to Lys229 in α (Figure 3), and that
residues 202-205 of ε are transiently close to His183 and
Asp252 of α (Figure 2). In the model, the CBMs of α and ε
bind to different subunits of β2 and the exonuclease
domain of ε readily approaches the DNA, while the
conformational space available to the Q-linker of ε is
sufficiently large to allow high mobility (Figure 6A).
Since we have been unable to detect an additional point
of contact of ε with either α (9) or β2 (above), it may be that
either (i) the globular ε186 domain remains mobile in the
complex (it can still rotate in its position in this model
without clashing with β2 or α) or (ii) it is held in a fixed
position through transient electrostatic contacts with
the double-stranded portion of the primer-template
DNA. Its precise position could potentially be defined
by further crosslinking studies, but we note that as with
our Bpa data, all crosslinking methods are inherently
unsuitable for precise definition of positions of components
of intrinsically dynamic complexes; they demonstrate
where subunits can be, not where they
necessarily are.

A third possibility is that the exonuclease domain
remains much more freely mobile in the complex during
DNA synthesis, and is reoriented to an appropriate
position during proofreading. The structured region of
ε186 ends at Gly180, and Gln182, the first residue of the
CBM, is bound in the protein-binding groove of β2.
Although the closeness of these residues restricts the
space that can be occupied by εθ, it is still possible for εθ
to rotate away from α:β2 to produce less compact and
more mobile structures.

Assessment of structural models using SAXS data

The αLεLθ:β2 complex, with both the CBMs in α and ε
strengthened, has been observed to be stable by ESI-MS
under native conditions (25), and as with the corresponding
complex containing αGL, it can be isolated
chromatographically (see Supplementary Methods). To assess
whether the αL subunit in this stabilized replicase complex
has a closed structure similar to that in our model
(Figure 6A) even in the absence of primer-template
DNA, we collected real-time gel-filtration SAXS data on
the αLεLθ:β2 complex at a synchrotron source (Figure 7)
and compared them with predicted scattering curves for
various structural models.

Analysis of the data showed good agreement with the
overall dimensions of an initial docking model with a
closed α conformation, but indicated too compact
packing of the εθ subunits to the α chain (Figure 7B).
An ensemble of 1000 alternate structures was generated
by allowing free rotation around the backbone dihedral
angles of Gly180-Gln182 in ε while disallowing steric
clashes with α:β2 (Figure 6B, Supplementary Movie S2
and Supplementary Pymol Session File S2). Averaging
over the ensemble resulted in a markedly improved fit
of the SAXS data at Q-values in the range of
0.07-0.12 Å^-1 (Figure 7), which suggests that εθ is not
restrained in a single conformation in the stabilized
αεθ:β2 complex, at least in the absence of primer-
template DNA.

Figure 6. Models of the αεθ:β2 complex with primer-template DNA in
the polymerization mode. Color coding: α (blue), with the PHP domain
of the crystal structure superimposed in red, ε (yellow), θ (orange) and
β2 (subunit in cyan contacting the CBM of ε and that in magenta
contacting the internal CBM of α). (A) Compact structure of a form
of the 'closed' complex (25) with εθ sandwiched between the β clamp
and the PHP domain of α. Multiple conformations are displayed for
the 22-residue linker segment connecting the α-bound portion of the
εCTS with the globular exonuclease domain of ε (ε186) that was
positioned to bring its CBM in proximity to the protein-binding groove of
β. All conformations of the linker segments are sterically allowed,
explaining the high mobility observed in this segment experimentally.
(B) It is possible to rotate ε out of the complex into a more open
structure while maintaining its contacts with the PHP domain of α
and β2. Multiple (other) sterically allowed exonuclease domain (εθ)
conformations are displayed; these represent a subset of the structures
used to back calculate scattering curves in Figure 7. The view on the
left is the same as in (A); that on the right is rotated 90° as indicated.
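
The comparison of an ensemble of models with the measured scattering can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting procedure used in this work (which is described in Supplementary Methods): it assumes that every theoretical curve has already been calculated on the same Q grid as the experiment, applies a single least-squares scale factor, and uses the illustrative names `ensemble_average` and `scaled_chi2`.

```python
def ensemble_average(curves):
    """Point-by-point mean of model intensity curves sampled on one Q grid."""
    if not curves:
        raise ValueError("need at least one curve")
    n = len(curves[0])
    if any(len(c) != n for c in curves):
        raise ValueError("all curves must share the same Q grid")
    return [sum(c[i] for c in curves) / len(curves) for i in range(n)]


def scaled_chi2(i_exp, sigma, i_calc):
    """Return (scale, reduced chi-square) for a model curve against data.

    The scale factor minimizes sum(((I_exp - scale * I_calc) / sigma)^2);
    no constant background term is fitted (a simplifying assumption).
    """
    if not (len(i_exp) == len(sigma) == len(i_calc)) or len(i_exp) < 2:
        raise ValueError("need at least two matched data points")
    num = sum(e * c / s ** 2 for e, s, c in zip(i_exp, sigma, i_calc))
    den = sum(c * c / s ** 2 for s, c in zip(sigma, i_calc))
    scale = num / den
    chi2 = sum(((e - scale * c) / s) ** 2
               for e, s, c in zip(i_exp, sigma, i_calc))
    return scale, chi2 / (len(i_exp) - 1)
```

Evaluating `scaled_chi2` once for a single compact model and once for the output of `ensemble_average` over all models gives two numbers that can be compared directly; a lower reduced chi-square for the averaged curve would indicate that the ensemble describes the data better than any one conformation.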



DISCUSSION 

The crystal structure of the α270:εCTS complex solved in
the present work allows, for the first time, the building of
informed models of the αεθ:β2 replicase complex with
primer-template DNA in the polymerization mode
(Figure 6A). The high-affinity binding site of the εCTS
on the PHP domain of α turned out to be surprisingly
remote from the active site of the polymerase. The long
Q-linker of ε was found to be highly mobile even in the
context of the αεθ:β2 complex, readily accommodating a
conformation that allows the CBM located in ε near the
C-terminus of the globular exonuclease domain (25) to




Figure 7. SEC-SAXS measurement of αLεLθ:β2 complex. (A) Elution
profile showing the A280 (black continuous line) and SAXS data
including the forward scattering intensity I(0) (green-dashed line) and
Rg (in blue) calculated by Guinier analysis of five-exposure bins; values
of I(0) vary linearly between 0.0007 and 0.0105 cm^-1. To obtain the
experimental SAXS pattern, exposures were averaged in the region of
Rg stability (bounded by vertical red bars) before data reduction and
buffer subtraction. (B) Experimental SAXS data for the αLεLθ:β2
complex (scatter plot), for which the pair distance distribution (not
shown) indicates Rg = 48.8 ± 0.2 Å and maximum dimension of
152.5 Å. For comparison, the averaged theoretical scattering of 1000
αεθ:β2 models generated by free backbone rotation in segment ε180-182
(Figure 6B) is shown in red; these models have mean Rg = 47.6 Å and
mean envelope diameter of 154.2 Å. The theoretical scattering of a
typical αεθ:β2 model with a more open (loose) orientation of εθ is
shown in blue; Rg = 48.3 Å, envelope diameter = 152.9 Å, and that of
an αεθ:β2 model with a compact orientation of εθ is shown in green;
Rg = 46.8 Å, envelope diameter = 151.5 Å.
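
The Guinier analysis mentioned in the caption rests on the low-angle approximation ln I(q) = ln I(0) - (Rg^2 / 3) q^2, so Rg and I(0) follow from a straight-line fit of ln I against q^2. The sketch below is illustrative only: it assumes that the caller has already restricted the data to the Guinier region (commonly q·Rg below about 1.3 for compact particles) and that intensities are positive, and the function name `guinier_fit` is not taken from the source.

```python
import math


def guinier_fit(q, intensity):
    """Estimate (Rg, I0) from a linear fit of ln I(q) against q^2.

    Assumes the points lie in the Guinier region; Rg is returned in the
    reciprocal units of q (Angstrom if q is in inverse Angstrom).
    """
    if len(q) != len(intensity) or len(q) < 3:
        raise ValueError("need at least three matched (q, I) points")
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: data are not Guinier-like")
    rg = math.sqrt(-3.0 * slope)          # slope = -Rg^2 / 3
    i0 = math.exp(mean_y - slope * mean_x)  # intercept = ln I(0)
    return rg, i0
```

Applied to each bin of exposures across an elution peak, a fit of this kind yields one Rg and one I(0) per bin, which is how a region of stable Rg can be identified before averaging.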



bind to the well-established protein-binding site of β. It
is intriguing to speculate that the exonuclease could
swing a long distance from the DNA when its CBM is
released from the β2 clamp. Release of the ε CBM from
β2 would be a requirement for entry of other β-binding
proteins, including repair and translesion polymerases
(56) into the replicase complex, and might also occur in
the transition from polymerization to proofreading
modes (25). The distance between the polymerase and
exonuclease active sites in all model structures in
Figure 6 is >70 Å (average of 92.3 Å for the ensemble
in Figure 6B); that this distance is so large suggests it is
very likely that the ε-β contact is broken during transition
to the proofreading mode, to allow α to assume a
more open structure and access of ε to the mismatched
primer terminus. To compensate for the loss of binding
affinity of the CBM, the conformational change in α
could expose a cryptic binding site for the ε186 domain
(or θ), such that the exonuclease site is appropriately
positioned for proofreading. In this scenario, binding of
ε186:θ to either the β2 clamp or to α would present a
switch between the two modes that is fundamentally different
from that observed in simpler polymerases with an
integrated proofreading domain, where the transition
between polymerization and proofreading modes
requires protein-mediated transfer of the 3' end of the
primer over a distance of 20-30 Å between the polymerase
and exonuclease active sites (29,30). In contrast,
repositioning of the exonuclease domain of Pol III over a
sufficiently large distance is perfectly conceivable, as the
ε186 domain does not interact to any appreciable degree
with the εCTS, α or, as shown here, β (9).
Proofreading would still require disengagement of the
mismatched primer-template from the polymerase active
site and sliding back of the dsDNA portion through the
β2 clamp to allow access of the 3' primer terminus to the
exonuclease active site.

On a technical note, cell-free protein synthesis proved to
be a decisive tool in this project, as α270 folded into a
defined conformation when expressed by cell-free synthesis
but not when it was produced in vivo. Furthermore,
overexpression of full-length ε in vivo leads to insoluble
protein, which in the past could only be solubilized by a
denaturation and refolding protocol (57). Cell-free synthesis
of ε in the presence of its natural binding partners θ and
α, however, circumvented this problem, readily yielding
the stable ternary αεθ complex (9). Similarly, εCTS59
when expressed by itself was insoluble, but soluble
complexes with α270 and mutants thereof were readily
obtained by cell-free synthesis. Furthermore, this
approach allowed efficient 15N-labeling of individual
proteins in selective (58,59) and combinatorial (38,39)
labeling schemes, providing a route to NMR resonance
assignments of samples of limited solubility and stability.
Finally, the cell-free approach is uniquely suited for the
incorporation of unnatural amino acids (31), in the present
work affording the facile incorporation of the unnatural
amino acid Bpa for photo-crosslinking. This method may
present a useful tool to probe structures of larger
replisomal complexes in the future.



CONCLUSION 

The extraordinarily long flexible tether by which the 
globular domain of the proofreading exonuclease is 
attached to the polymerase subunit raises the expectation 
of large conformational changes involved in the transition 
from the polymerization to the proofreading mode. 
Future studies may attempt to probe this by single- 
molecule fluorescence resonance energy transfer (FRET) 
experiments.